Don't interpret decayed data as we've failed to send tiny values #3368
Conversation
When we're calculating the success probability for min-/max-bucket pairs and are looking at the 0th min-bucket, we only look at the highest max-bucket to decide the success probability. We ignore max-buckets which have a value below `BUCKET_FIXED_POINT_ONE` so that we only consider values which aren't substantially decayed. However, if all of our data is substantially decayed, this filter causes us to conclude that the highest max-bucket is bucket zero even though we really should then be looking at any bucket. We make that change here, looking at the highest non-zero max-bucket if no max-buckets have a value above `BUCKET_FIXED_POINT_ONE`.
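For reference, here is a minimal sketch of the selection logic described above. This is not the actual scorer code: the 32-bucket `u16` histogram, the constant's value, and the helper name `select_max_bucket` are illustrative assumptions.

```rust
/// Assumed fixed-point representation of "one" datapoint (illustrative value).
const BUCKET_FIXED_POINT_ONE: u16 = 32;

/// Picks which max-bucket to use when computing the success probability for
/// the 0th min-bucket.
fn select_max_bucket(max_buckets: &[u16; 32]) -> usize {
    // Highest bucket holding any value at all, even a heavily decayed one.
    let mut highest_max_bucket_with_points = 0;
    // Highest bucket holding a value that hasn't substantially decayed.
    let mut highest_max_bucket_with_full_points = None;

    for (bucket_idx, bucket_value) in max_buckets.iter().enumerate() {
        if *bucket_value >= BUCKET_FIXED_POINT_ONE {
            highest_max_bucket_with_full_points = Some(bucket_idx);
        }
        if *bucket_value != 0 {
            highest_max_bucket_with_points = bucket_idx;
        }
    }

    // If every bucket has decayed below "one", fall back to the highest
    // non-zero bucket rather than wrongly concluding the max is bucket zero.
    highest_max_bucket_with_full_points.unwrap_or(highest_max_bucket_with_points)
}
```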
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff            @@
##              main    #3368      +/- ##
=========================================
- Coverage    89.61%   89.58%    -0.03%
=========================================
  Files          127      127
  Lines       103533   103538        +5
  Branches    103533   103538        +5
=========================================
- Hits         92778    92754       -24
- Misses        8056     8076       +20
- Partials      2699     2708        +9

☔ View full report in Codecov by Sentry.
LGTM
let selected_max = highest_max_bucket_with_full_points.unwrap_or(highest_max_bucket_with_points);
let max_bucket_end_pos = BUCKET_START_POS[32 - selected_max] - 1;
Slightly confused; most likely I am missing something.
IIUC, we used to "ignore max-buckets which have a value below BUCKET_FIXED_POINT_ONE" and we no longer want to do that, so why not just track a single max_bucket instead of tracking two max-buckets, if they are valid regardless of the value they hold (as long as it isn't zero)?
We're picking the highest max-bucket which has some value to use as the max-bucket for the calculation, but we don't really want to be looking at a really high max-bucket if it's fairly far decayed and there's some other max-bucket that isn't (as) decayed. We could have some smarter heuristic for that, of course, but CPU cycles in this code are very expensive, so the naive over-/under-one check is used instead.
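To illustrate why the over-/under-one threshold matters: bucket values are stored in fixed point, and after enough decay a value that started at or above `BUCKET_FIXED_POINT_ONE` can shrink to a small but non-zero number. The right-shift-per-half-life decay below is an assumption for illustration, not necessarily the scorer's exact mechanism.

```rust
/// Assumed fixed-point "one" (illustrative value, matching the sketch above).
const BUCKET_FIXED_POINT_ONE: u16 = 32;

/// Hypothetical decay: halve the stored value once per elapsed half-life.
fn decayed_value(original: u16, half_lives_elapsed: u32) -> u16 {
    original.checked_shr(half_lives_elapsed).unwrap_or(0)
}

fn main() {
    // A full datapoint decays below "one" after a few half-lives but is
    // still non-zero, so it should still count as usable max-bucket data.
    assert_eq!(decayed_value(BUCKET_FIXED_POINT_ONE, 3), 4);
    assert!(decayed_value(BUCKET_FIXED_POINT_ONE, 3) < BUCKET_FIXED_POINT_ONE);
}
```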
LGTM!
This was included in the 0.0.125 release.